{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Any learning algorithm will always have strengths and weaknesses: a single model is unlikely to fit every possible scenario. Ensembles combine multiple models to achieve higher generalization performance than any of the constituent models is capable of. How do we assemble the weak learners? We can use some sequential heuristics. For instance, given the current collection of models, we can add one more based on where that particular model performs well. Alternatively, we can look at all the correlations of the predictions between all models, and optimize for the most uncorrelated predictors. Since this latter is a global approach, it naturally maps to a quantum computer.\n",
    "\n",
    "# Ensemble methods\n",
    "\n",
    "Ensembles yield better results when there is considerable diversity among the base classifiers. If diversity is sufficient, base classifiers make different errors, and a strategic combination may reduce the total error, ideally improving generalization performance. A constituent model in an ensemble is also called a base classifier or weak learner, and the composite model a strong learner.\n",
    "\n",
    "The generic procedure of ensemble methods has two steps. First, develop a set of base classifiers from the training data. Second, combine them to form the ensemble. In the simplest combination, the base learners vote, and the label prediction is based on majority. More involved methods weigh the votes of the base learners. \n",
    "\n",
    "Let us import some packages and define our figure of merit as accuracy in a balanced dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2018-11-19T20:10:18.000793Z",
     "start_time": "2018-11-19T20:10:17.128450Z"
    }
   },
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "import sklearn\n",
    "import sklearn.datasets\n",
    "import sklearn.metrics\n",
    "%matplotlib inline\n",
    "\n",
    "metric = sklearn.metrics.accuracy_score"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We generate a random dataset of two classes that form concentric circles:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2018-11-19T20:10:18.174692Z",
     "start_time": "2018-11-19T20:10:18.003641Z"
    }
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAWQAAAFbCAYAAADiN/RYAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDMuMC4yLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvOIA7rQAAEuVJREFUeJzt3T9vW0fWB+AjYQMCbtRk4Y4i1KRKlybdAq7zCdS73l7FIgV71+75CVIbcKcmqVylMUx1wr6NGgNCAPMtvIxsi5Yp3j9z5s7zlIJIU7xzfhjPnTn3aLPZBADlHZf+AAB8JJABkhDIAEkIZIAkBDJAEgIZIAmBDJCEQAZIQiADJPGPx/zy999/v1ksFgN9FIBp+uOPP/5vs9n881u/96hAXiwW8fvvvx/+qQAadHR0tN7n9yxZACQhkAGSEMgASQhkgCQEMkASAhkgCYEMkIRABkhCIAMkIZABkhDIAEkIZNJard7EYvEijo9/jcXiRaxWb1K8FwzlUc2FYCyr1Zt4/vy3eP/+r4iIWK9v4vnz3yIi4vz8x2LvBUMyQyali4tXfwfo1vv3f8XFxaui7wVDEsj0qq+lgaurm0f9fKz3svTBkAQyvdkuDazXN7HZ3C0NHBJa8/nJo34+xnv1+ffBLgKZ3vS5NLBcPosnT7777GdPnnwXy+WzYu9l6YOhCWR60+fSwPn5j/Hy5S9xenoSR0cRp6cn8fLlLwfdhOvrvfr8+2AXuyyI1epNXFy8iqurm5jPT2K5fHZQ8M3nJ7Fe3w+nQ5YZIj4GaV+7IPp4r77/vr6+d6bDDLlxfa6L9rnMkFGff5/1aHYRyI3rc120z2WGjPr8+6xHs8vRZrPZ+5d/+umnjadOT8vx8a+xawgcHUV8+PCf8T9QI3zvbTk6Ovpjs9n89K3fM0NuXJ/by9if751dBHLjpr7um5XvnV0EcuOmvu6ble+dXawhV87Wqba5/nXYdw3ZPuSK6WLWNtd/eixZVMzWqba5/tMjkCvmKG/bXP/pEcgVs3Wqba7/9Ajkitk61TbXf3oEcsVsnWqb6z89tr0BDMzRaYDKCOSCPJ+NDIzDPBwMKcSmfjIwDnMxQy7Epn4yMA5zEciF2NRPBsZhLgK5EJv6ycA4zEUgF2JTPxkYh7kI5EJs6icD4zAXB0NIb3V9HRdv38bV7W3MZ7NYnp3F+dOng78W+qIfMpOwur6O53/+Ge8/fIiIiPXtbTz/88+IiG8Ga5fXQgmWLBjc6vo6FpeXcfz6dSwuL2N1fb33ay/evv07ULfef/gQF2/fDvrarp8bDmGGzKC6zlKvbm8f9fO+Xmt2TQlmyB05dvqwrrPU+Wz2qJ/39dqun7sVxn+/BHIH22On6/VNbDZ3x06nOigP+S98l1lqRMTy7CyeHH8+TJ8cH8fy7GzQ13adXbew1NHa+B+DQO6gpWOn2//Cr29vYxN3/4X/Vth0maVGfFweePnDD3E6m8VRRJzOZvHyhx/2Wjbo8tpDP/eh31ONWhr/Y7HtrYPj419j19d3dBTx4cN/xv9AA1pcXsZ6x+zwdDaLdz///NXXfbkWG/FxlrpvMJZy6Oc+9HuqUUvjvyv9kEfQ0rHTQ/8L32WWWtKhn7vrEk1NWhr/Y7HLooPl8tlnrQsjpnvsdD6b7Zz57bP0cP70afoA3uWQz93le6pNS+N/LGbIHbR07LTLDbKWtPQ9tTT+x2INuUGHHid2DHk/vl++tO8askBuTK032abOdZk2N/XYyYGHnFwXIgRyc1raBVAT14UIgdycrgc1GIbrQoRA/lsrZ/Jb2gVQk9auSyv19lj2IUdbj0Lf3iByNz+Xlq5LS/X2WHZZRMRi8SLW6/tP2T09PYl37/5d4BPtxzYpIuobB7XWWxeeGPIINT4KXb9eIuocBzXW21isIUedZ/JtkyKiznFQY72NRSBHnY9Ct02KiDrHQY31NhaBHHWeybdNiog6x0GN9TYWN/Uq5agtEcZBLRydnrha+wzTL+NgWsyQAQZmhlyhVh6OSVnGWV72ISdR435S6mOc5WaGnESN+0mpj3GWm0BOosb9pNTHOMtNICdR435S6mOc5TbZQK6tvV9r7Rcpo9ZxVls9H2qSN/VqbO/XUvtFyqlxnNVYz4ea5D7kFtv7wVRNoZ6b3oesvR9MR0v1PMlA1t4PpqOlep5kIGvvB9PRUj1PMpAztfdzTJWaZRi/mep5aJO8qZeF1ojUzPjtT9M39bJwTJWaGb/jE8gDckyVmhm/4xPIA3JMlZoZv+MTyAOq9ZgqRBi/JQjkAXm8DjUzfsdnlwXAwOyyAKhMdYHcShs+4PFqz4eq2m+21IYPeJwp5ENVM+SLi1d/f9lb79//FRcXr0b/LBmOlEIWGeohUz4cqqoZcpY2fJ7cC3ey1EOWfOiiqhlyljZ8jpTCnSz1kCUfuqgqkLO04XOkFO5kqYcs+dBFVYGcpQ2fI6VwJ0s9ZMmHLhwMOYC2hHBHPXybgyEDcqQU7qiH/pghAwzMDBmgMgIZIAmBDJCEQAZIQiB/RYaz+TAV6mk/VfWyGEuWs/kwBeppf2bIO2Q5mw9ToJ72ly6QMzSYznI2H6YgUz1lyJeHpFqyyNJgej6bxXrHYNGrAh4vSz1lyZeHpJohZ2kw7fHn0J8s9ZQlXx6SKpCzNJh2Nh/6k6WesuTLQ1ItWcznJ7Fe3/9ySjSYPn/6VABDTzLUU6Z8+ZpUM+QpNJgGcqohX1IF8hQaTAM51ZAv2m8CDEz7TYDKCGSAJAQyQBLNB7IuVFCO+vtcqn3IY9OFCspRf/c1PUPWhQrKUX/3NR3ImbpQQWvU331NB/LXuk3p6gbDU3/3NR3IWbpQQYvU331NB3KWLlTQIvV3X7Gj06vVm7i4eBVXVzcxn5/Ecvks1ZlyoB1D59G+R6eLbHuroXM/0IZMeVRkyaKGzv1AGzLlUZFArqFzP9CGTHlUJJC/1qE/U+d+oA2Z8qhIINfQuR9oQ6Y8KhLINXTuB9qQKY+aemLI6vo6Lt6+javb25jPZrE8O2t6zyNkNMU6Tb3trQSdpSC/1uu0mZN6OktBfq3XaTOBrLMU5Nd6nTYTyDpLQX6t12kzgayzFOTXep02E8g6S0F+rddpU9veAErYd9tbMzNkgOwEMkASAhkgidECebV6E4vFizg+/jUWixexWr0Z658GeJRSeTXK0elMHfkBHlIyr0aZIWfqyA/wkJJ5NUogl+zIv7q+jsXlZRy/fh2Ly8tYXV8P/m8C/ShRvyXzapRALtWRf9s5an17G5u46xwllCG/UvVb8gkiowRyqY78rXeOgpqVqt+STxAZJZBLdeRvvXMU1KxU/ZZ8gshoDerPz38cfUfFfDaL9Y6L10rnKKhZyfotkVcREz8Y0nrnKKhZi/U76UBuvXMU1KzF+tXtDWBgur0BVEYgAyQhkAGSEMgASQhkgCQEMkASAhkgiVECuVT3fa03oX6l6rhEbg3ey6JU9/1t675tt6ht676ImPRJH5iSUnVcKrcGnyGX6r6v9SbUr1Qdl8qtwQO5VPd9rTehfqXquFRuDR7Ipbrvf61Fn9abUI9SdVwqtwYP5FLd91ts3QdTU6qOS+XW4IFcqvt+i637YGpK1XGp3NJ+E2Bg2m8CVEYgAyQhkAGSEMgASQhkgCQEMkASAhkgickHshacUK/W6nfw9pslacEJ9Wqxfic9Q9aCE+rVYv1OOpC14IR6tVi/owVyicehaMEJ9SpZv6UeOzfaM/WeP/8t1uub2GzuHocy9B+pBSfUq1T9lsqriJECudTjULTghHqVqt9SeRUx0i6LUo9Difh4UQUw1KlE/ZbMq1FmyKUehwLwWCXzapRALvU4FIDHKplXowRyqcehADxWybzyCCeAgXmEE0BlBDJAEgIZIImmArm1Vn5Qo5brdNLtNz/VYis/qE3rddrMDLnFVn5Qm9brtJlAbrGVH9Sm9TptJpC14oT8Wq/TZgJZK07Ir/U6bSaQteKE/Fqv02JHp1erN3Fx8Squrm5iPj+J5fKZ3hZAEUPn0b5Hp4tse9t25N82gd525I8IoQyMKlMeFVmyKNmRH+BTmfKoSCCX7MgP8KlMeVQkkD1BBMgiUx4VCWRPEAGyyJRHRQLZE0SALDLlUfNPDFldX8fF27dxdXsb89kslmdnzex5hNJaqb/U296yaL2zFJSk/u5r5qTeLq13loKS1N99TQdy652loCT1d1/Tgdx6ZykoSf3d13Qgt95ZCkpSf/c1Hcitd5aCktTffc1vewMY2r7b3pqeIQNkIpABkkgXyKvVm1gsXsTx8a+xWLyI1epN6Y8ETET2fEl1Ui9To2hgWmrIl1Qz5EyNooFpqSFfUgVypkbRq+vrWFxexvHr17G4vIzV9fXonwGmIkM9ZcqXr0kVyFkaRW+bnqxvb2MTd01PhDI8XpZ6ypIvD0kVyFkaRWt6Av3JUk9Z8uUhqQI5S6NoTU+gP1nqKUu+PCTVLouIj19a6S9oPpvFesdgabnpCRwqUz1lyJeHpJohZ6HpCfRHPe1PIO+g6Qn0Rz3tT3MhgIFpLgRQGYEMkIRABkhCIAMkIZAPlOFsPmShHvqR7mBIDbZn87fHQbdn8yPCVh6aox76U90MOUOD6Sxn8yGDTPWQIR+6qGqGnKXBdJaz+ZBBlnrIkg9dVDVDztJg+mtn8PW6oEVZ6iFLPnRRVSBnaTDtbD7cyVIPWfKhi6oCOUuDaWfz4U6WesiSD11UtYa8XD77bI0oolyD6fOnTwUw/E+GesiUD4eqaoZcQ4NpoIwp5INubwAD0+0NoDICGSAJgTwwZ/ypmfE7rqp2WdTGGX9qZvyOzwx5QJnO+MNjGb/jE8gDynLGHw5h/I5PIA8oyxl/OITxOz6BPKAsZ/zhEMbv+CYbyBn6omY54w+HyDR+M9TzGCZ5Uu/LvqgRH8+013aMEphGPTd9Um8KfVGBj1qq50kG8hT6ogIftVTPkwzkWvuiOhXFGGobZ7XW8yEmGcjL5bN48uS7z36WvS/q9lTU+vY2NnF3Kip7sVCXGsdZjfV8qEkGco19UZ2KYgw1jrMa6/lQk9xlUaPj169j15U4iogP//rXyJ+GqTLOymh6l0WNnIpiDMZZbgI5CaeiGINxlptATiLTqSimyzjLzRoywMCsIQNURiBXrLYN/gzDOJgOgfw/tXWTqnGDP/2rdRzUVm9jEchx101qvb6JzSZivb6J589/Sz1IatzgT/9qHAc11ttYBHLU2U3K43WIqHMc1FhvYxHIUWc3KRv8iahzHNRYb2MRyFFnNykb/ImocxzUWG9jEchRZzcpG/yJqHMc1FhvY3Ew5H9WqzdxcfEqrq5uYj4/ieXy2SS7SUV8vDN/8fZtXN3exnw2i+XZWeoCbkVL16WleovY/2CIQG7MdpvUp3fmnxwfp59VTZ3rMm1O6rFTjdukWuC6ECGQm1PjNqkWuC5ECOTm1LhNqgWuCxECuTldtknpmbCfQ76nGrev0b9/lP4AjGt7g+ixd/O/vOm07Znw6Xty+Pd06HVhWuyy6KiV7TuLy8tY71jPPJ3N4t3PPxf4RDm19j21Mv672neXhRlyB9smKdtz+dsmKRExuUHpptN+WvqeWhr/Y7GG3EFLTVK63HSqde35kM/d0s25lsb/WARyBy01STn0plO1/XoP/Nwt3ZxrafyPRSB30FKTlEN7JvRx4KHLDPvQ1x76uWvsLXGolsb/WKwhd7BcPvtsDS1i2k1Szp8+fXSwdF1T7bK7o8tru3zuQ76nGrU2/sdghtzB+fmP8fLlL3F6ehJHRxGnpyfx8uUvbmh8ouuaapcZdpfXtrQWfCjjv39myB2dn/9oAD5geXa2s2nOvmuqXWaqXV7b9XO3wvjvlxkyg+q6ptplptrltS2tBZOHGTKD67Km2mWm2nWW28paMHmYIRfkUejf1mWmapa7H+MwD0enC/nylFPExzvUboowJuNwHBrUJ+eUExkYh7kI5EKcciID4zAXgVyIU05kYBzmIpAL8Sh0MjAOcxHIhTjlRAbGYS52WQAMzC4LgMoI5MrZ1N82139aHJ2umEfotM31nx4z5IrZ1N821396BHLFbOpvm+s/PQK5Yjb1t831nx6BXDGb+tvm+k+PQK6YTf1tc/2nx8EQYrV6ExcXr+Lq6ibm85NYLp8p6hH43tux78EQ294aZ+tUGb53drFk0Thbp8rwvbOLQG6crVNl+N7ZRSA3ztapMnzv7CKQG9f31qmp91bo6++zZY1dBHLj+tw6tb1RtV7fxGZzd6NqKqHc599nyxq72PZGbxaLF7Fe318DPT09iXfv/v3o9+tzW1gf79X330c7bHtjdH3eqOpzW1hf7+VGHEOzZEFv+rxR1ee2sL7ey404hiaQ6U2fN6r6nI329V5uxDE0gUxv+rxR1edstK/3ciOOobmpR0pfrvtGfJyNHhKAfb4XHMJDTqlan7NRM1tqYYYMMDAzZIDKCGSAJAQyQBICGSAJgQyQhEAGSEIgAyQhkAGSEMgASQhkgCQEMkASj+plcXR09N+IWA/3cQAm6XSz2fzzW7/0qEAGYDiWLACSEMgASQhkgCQEMkASAhkgCYEMkIRABkhCIAMkIZABkvh/lQ5XRkU/dZIAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 432x432 with 1 Axes>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "np.random.seed(0)\n",
    "data, labels = sklearn.datasets.make_circles()\n",
    "idx = np.arange(len(labels))\n",
    "np.random.shuffle(idx)\n",
    "# train on a random 2/3 and test on the remaining 1/3\n",
    "idx_train = idx[:2*len(idx)//3]\n",
    "idx_test = idx[2*len(idx)//3:]\n",
    "X_train = data[idx_train]\n",
    "X_test = data[idx_test]\n",
    "\n",
    "y_train = 2 * labels[idx_train] - 1  # binary -> spin\n",
    "y_test = 2 * labels[idx_test] - 1\n",
    "\n",
    "scaler = sklearn.preprocessing.StandardScaler()\n",
    "normalizer = sklearn.preprocessing.Normalizer()\n",
    "\n",
    "X_train = scaler.fit_transform(X_train)\n",
    "X_train = normalizer.fit_transform(X_train)\n",
    "\n",
    "X_test = scaler.fit_transform(X_test)\n",
    "X_test = normalizer.fit_transform(X_test)\n",
    "plt.figure(figsize=(6, 6))\n",
    "plt.subplot(111, xticks=[], yticks=[])\n",
    "plt.scatter(data[labels == 0, 0], data[labels == 0, 1], color='navy')\n",
    "plt.scatter(data[labels == 1, 0], data[labels == 1, 1], color='c');"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's train a perceptron:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2018-11-19T20:10:18.226327Z",
     "start_time": "2018-11-19T20:10:18.177561Z"
    },
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "accuracy (train):  0.55\n",
      "accuracy (test):  0.41\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/manoel/anaconda3/envs/QISKitenv/lib/python3.6/site-packages/sklearn/linear_model/stochastic_gradient.py:144: FutureWarning: max_iter and tol parameters have been added in Perceptron in 0.19. If both are left unset, they default to max_iter=5 and tol=None. If tol is not None, max_iter defaults to max_iter=1000. From 0.21, default max_iter will be 1000, and default tol will be 1e-3.\n",
      "  FutureWarning)\n"
     ]
    }
   ],
   "source": [
    "from sklearn.linear_model import Perceptron\n",
    "model_1 = Perceptron()\n",
    "model_1.fit(X_train, y_train)\n",
    "print('accuracy (train): %5.2f'%(metric(y_train, model_1.predict(X_train))))\n",
    "print('accuracy (test): %5.2f'%(metric(y_test, model_1.predict(X_test))))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Since its decision surface is linear, we get a poor accuracy. Would a support vector machine with a nonlinear kernel fare better?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2018-11-19T20:10:18.244639Z",
     "start_time": "2018-11-19T20:10:18.230025Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "accuracy (train):  0.64\n",
      "accuracy (test):  0.24\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/Users/manoel/anaconda3/envs/QISKitenv/lib/python3.6/site-packages/sklearn/svm/base.py:196: FutureWarning: The default value of gamma will change from 'auto' to 'scale' in version 0.22 to account better for unscaled features. Set gamma explicitly to 'auto' or 'scale' to avoid this warning.\n",
      "  \"avoid this warning.\", FutureWarning)\n"
     ]
    }
   ],
   "source": [
    "from sklearn.svm import SVC\n",
    "model_2 = SVC(kernel='rbf')\n",
    "model_2.fit(X_train, y_train)\n",
    "print('accuracy (train): %5.2f'%(metric(y_train, model_2.predict(X_train))))\n",
    "print('accuracy (test): %5.2f'%(metric(y_test, model_2.predict(X_test))))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "It performs better on the training set, but at the cost of extremely poor generalization. \n",
    "\n",
    "Boosting is an ensemble method that explicitly seeks models that complement one another. The variation between boosting algorithms is how they combine weak learners. Adaptive boosting (AdaBoost) is a popular method that combines the weak learners in a sequential manner based on their individual accuracies. It has a convex objective function that does not penalize for complexity: it is likely to include all available weak learners in the final ensemble. Let's train AdaBoost with a few weak learners:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2018-11-19T20:10:18.314089Z",
     "start_time": "2018-11-19T20:10:18.248869Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "accuracy (train):  0.65\n",
      "accuracy (test):  0.29\n"
     ]
    }
   ],
   "source": [
    "from sklearn.ensemble import AdaBoostClassifier\n",
    "model_3 = AdaBoostClassifier(n_estimators=3)\n",
    "model_3.fit(X_train, y_train)\n",
    "print('accuracy (train): %5.2f'%(metric(y_train, model_3.predict(X_train))))\n",
    "print('accuracy (test): %5.2f'%(metric(y_test, model_3.predict(X_test))))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Its performance is marginally better than that of the SVM."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# QBoost\n",
    "\n",
    "The idea of Qboost is that optimization on a quantum computer is not constrained to convex objective functions, therefore we can add arbitrary penalty terms and rephrase our objective [[1](#1)]. Qboost solves the following problem:\n",
    "\n",
    "$$\n",
    "\\mathrm{argmin}_{w} \\left(\\frac{1}{N}\\sum_{i=1}^{N}\\left(\\sum_{k=1}^{K}w_kh_k(x_i)-\n",
    "y_i\\right)^2+\\lambda\\|w\\|_0\\right),\n",
    "$$\n",
    "\n",
    "where $h_k(x_i)$ is the prediction of the weak learner $k$ for a training instance $k$. The weights in this formulation are binary, so this objective function is already maps to an Ising model. The regularization in the $l_0$ norm ensures sparsity, and it is not the kind of regularization we would consider classically: it is hard to optimize with this term on a digital computer.\n",
    "\n",
    "Let us expand the quadratic part of the objective:\n",
    "\n",
    "$$\n",
    "\\mathrm{argmin}_{w} \\left(\\frac{1}{N}\\sum_{i=1}^{N}\n",
    "\\left( \\left(\\sum_{k=1}^{K} w_k h_k(x_i)\\right)^{2} -\n",
    "2\\sum_{k=1}^{K} w_k h_k(\\mathbf{x}_i)y_i + y_i^{2}\\right) + \\lambda \\|w\\|_{0}\n",
    "\\right).\n",
    "$$\n",
    "\n",
    "Since $y_i^{2}$ is just a constant offset, the optimization reduces to\n",
    "\n",
    "$$\n",
    "\\mathrm{argmin}_{w} \\left(\n",
    "\\frac{1}{N}\\sum_{k=1}^{K}\\sum_{l=1}^{K} w_k w_l\n",
    "\\left(\\sum_{i=1}^{N}h_k(x_i)h_l(x_i)\\right) - \n",
    "\\frac{2}{N}\\sum_{k=1}^{K}w_k\\sum_{i=1}^{N} h_k(x_i)y_i +\n",
    "\\lambda \\|w\\|_{0} \\right).\n",
    "$$\n",
    "\n",
    "This form shows that we consider all correlations between the predictions of the weak learners: there is a summation of $h_k(x_i)h_l(x_i)$. Since this term has a positive sign, we penalize for correlations. On the other hand, the correlation with the true label, $h_k(x_i)y_i$, has a negative sign. The regularization term remains unchanged.\n",
    "\n",
    "Let us consider all three models from the previous section as weak learners."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2018-11-19T20:10:18.320974Z",
     "start_time": "2018-11-19T20:10:18.316633Z"
    }
   },
   "outputs": [],
   "source": [
    "models = [model_1, model_2, model_3]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We calculate their predictions and set $\\lambda$ to 1. The predictions are scaled to reflecting the averaging in the objective."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2018-11-19T20:10:18.354723Z",
     "start_time": "2018-11-19T20:10:18.323802Z"
    }
   },
   "outputs": [],
   "source": [
    "n_models = len(models)\n",
    "\n",
    "predictions = np.array([h.predict(X_train) for h in models], dtype=np.float64)\n",
    "# scale hij to [-1/N, 1/N]\n",
    "predictions *= 1/n_models\n",
    "\n",
    "λ = 1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We create the quadratic binary optimization of the objective function as we expanded above:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2018-11-19T20:10:18.375760Z",
     "start_time": "2018-11-19T20:10:18.357248Z"
    }
   },
   "outputs": [],
   "source": [
    "w = np.dot(predictions, predictions.T)\n",
    "wii = len(X_train) / (n_models ** 2) + λ - 2 * np.dot(predictions, y_train)\n",
    "w[np.diag_indices_from(w)] = wii\n",
    "W = {}\n",
    "for i in range(n_models):\n",
    "    for j in range(i, n_models):\n",
    "        W[(i, j)] = w[i, j]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We solve the quadratic binary optimization with simulated annealing and read out the optimal weights:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2018-11-19T20:10:18.703378Z",
     "start_time": "2018-11-19T20:10:18.378217Z"
    },
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "import dimod\n",
    "sampler = dimod.SimulatedAnnealingSampler()\n",
    "response = sampler.sample_qubo(W, num_reads=10)\n",
    "weights = list(response.first.sample.values())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We define a prediction function to help with measuring accuracy:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2018-11-19T20:10:18.715360Z",
     "start_time": "2018-11-19T20:10:18.705496Z"
    }
   },
   "outputs": [],
   "source": [
    "def predict(models, weights, X):\n",
    "\n",
    "    n_data = len(X)\n",
    "    T = 0\n",
    "    y = np.zeros(n_data)\n",
    "    for i, h in enumerate(models):\n",
    "        y0 = weights[i] * h.predict(X)  # prediction of weak classifier\n",
    "        y += y0\n",
    "        T += np.sum(y0)\n",
    "\n",
    "    y = np.sign(y - T / (n_data*len(models)))\n",
    "\n",
    "    return y"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2018-11-19T20:10:18.734604Z",
     "start_time": "2018-11-19T20:10:18.719931Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "accuracy (train):  0.65\n",
      "accuracy (test):  0.29\n"
     ]
    }
   ],
   "source": [
    "print('accuracy (train): %5.2f'%(metric(y_train, predict(models, weights, X_train))))\n",
    "print('accuracy (test): %5.2f'%(metric(y_test, predict(models, weights, X_test))))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The accuracy co-incides with our strongest weak learner's, the AdaBoost model. Looking at the optimal weights, this is apparent:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2018-11-19T20:10:18.751765Z",
     "start_time": "2018-11-19T20:10:18.736771Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0, 0, 1]"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "weights"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Only AdaBoost made it to the final ensemble. The first two models perform poorly and their predictions are correlated. Yet, if you remove regularization by setting $\\lambda=0$ above, the second model also enters the ensemble, decreasing overall performance. This shows that the regularization is in fact important."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Solving by QAOA\n",
    "\n",
    "Since eventually our problem is just an Ising model, we can also solve it on a gate-model quantum computer by QAOA. Let us explicitly map the binary optimization to the Ising model:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2018-11-19T20:10:18.765328Z",
     "start_time": "2018-11-19T20:10:18.754605Z"
    }
   },
   "outputs": [],
   "source": [
    "h, J, offset = dimod.qubo_to_ising(W)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We have to translate the Ising couplings to be suitable for solving by the QAOA routine:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2018-11-19T20:10:19.838597Z",
     "start_time": "2018-11-19T20:10:18.767740Z"
    }
   },
   "outputs": [],
   "source": [
    "from qiskit.quantum_info import Pauli\n",
    "from qiskit.aqua import Operator\n",
    "\n",
    "num_nodes = w.shape[0]\n",
    "pauli_list = []\n",
    "for i in range(num_nodes):\n",
    "    wp = np.zeros(num_nodes)\n",
    "    vp = np.zeros(num_nodes)\n",
    "    vp[i] = 1\n",
    "    pauli_list.append([h[i], Pauli(vp, wp)])\n",
    "    for j in range(i+1, num_nodes):\n",
    "        if w[i, j] != 0:\n",
    "            wp = np.zeros(num_nodes)\n",
    "            vp = np.zeros(num_nodes)\n",
    "            vp[i] = 1\n",
    "            vp[j] = 1\n",
    "            pauli_list.append([J[i, j], Pauli(vp, wp)])\n",
    "ising_model = Operator(paulis=pauli_list)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next we run the optimization:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2018-11-19T20:10:40.568546Z",
     "start_time": "2018-11-19T20:10:19.840830Z"
    }
   },
   "outputs": [],
   "source": [
    "from qiskit.aqua import QuantumInstance\n",
    "from qiskit import Aer\n",
    "from qiskit.aqua.algorithms import QAOA\n",
    "from qiskit.aqua.components.optimizers import COBYLA\n",
    "p = 1\n",
    "optimizer = COBYLA()\n",
    "qaoa = QAOA(ising_model, optimizer, p, operator_mode='matrix')\n",
    "backend = Aer.get_backend('statevector_simulator')\n",
    "quantum_instance = QuantumInstance(backend, shots=100)\n",
    "result = qaoa.run(quantum_instance)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, we extract the most likely solution:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2018-11-19T20:10:40.577140Z",
     "start_time": "2018-11-19T20:10:40.571807Z"
    }
   },
   "outputs": [],
   "source": [
    "k = np.argmax(result['eigvecs'][0])\n",
    "weights = np.zeros(num_nodes)\n",
    "for i in range(num_nodes):\n",
    "    weights[i] = k % 2\n",
    "    k >>= 1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's see the weights found by QAOA:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2018-11-19T20:10:40.597309Z",
     "start_time": "2018-11-19T20:10:40.579449Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0., 1., 1.])"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "weights"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And the final accuracy:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2018-11-19T20:10:40.614781Z",
     "start_time": "2018-11-19T20:10:40.602793Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "accuracy (train):  0.64\n",
      "accuracy (test):  0.29\n"
     ]
    }
   ],
   "source": [
    "print('accuracy (train): %5.2f'%(metric(y_train, predict(models, weights, X_train))))\n",
    "print('accuracy (test): %5.2f'%(metric(y_test, predict(models, weights, X_test))))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# References\n",
    "\n",
    "[1] Neven, H., Denchev, V.S., Rose, G., Macready, W.G. (2008). [Training a binary classifier with the quantum adiabatic algorithm](https://arxiv.org/abs/0811.0416). *arXiv:0811.0416*.  <a id='1'></a>"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
